Abstract

The paper deals with asymptotic properties of the adaptive procedure
proposed by the authors (Galtchouk, Pergamenshchikov, 2007) for estimating an
unknown nonparametric regression. We prove that this procedure is
asymptotically efficient for a quadratic risk, i.e. the asymptotic quadratic risk
of this procedure coincides with the Pinsker constant, which gives a
sharp lower bound for the quadratic risk over all possible estimators.
1 Introduction

The paper deals with the estimation problem in the heteroscedastic
nonparametric regression model

$$ y_j = S(x_j) + \sigma_j(S)\, \xi_j , \tag{1.1} $$

where the design points are $x_j = j/n$, $S(\cdot)$ is an unknown function to be
estimated, $(\xi_j)_{1\le j\le n}$ is a sequence of centered independent random variables with
unit variance and $(\sigma_j(S))_{1\le j\le n}$ are unknown scale functionals depending on
the design points and the regression function $S$.

Typically, the notion of asymptotic optimality is associated with the
optimal convergence rate of the minimax risk (see e.g., Ibragimov, Hasminskii,
1981; Stone, 1982). An important question in optimality results is to
study the exact asymptotic behavior of the minimax risk. Such results have
been obtained only in a limited number of investigations. As to the
nonparametric estimation problem for heteroscedastic regression models, we should
mention the papers by Efroimovich, 2007, Efroimovich, Pinsker, 1996, and
Galtchouk, Pergamenshchikov, 2005, concerning the exact asymptotic
behavior of the $\mathcal{L}_2$-risk, and the paper by Brua, 2007, devoted to efficient
pointwise estimation for heteroscedastic regressions.

Heteroscedastic regression models are extensively used in financial
mathematics, in particular in calibration problems (see e.g., Belomestny,
Reiss, 2006). Moreover, these models are popular in econometrics (see e.g.,
Goldfeld, Quandt, 1972, p. 83), which, for example, for consumer budget
problems, makes use of a semiparametric version of model (1.1) with
scale coefficients of the type

$$ \sigma_j^2(S) = c_0 + c_1 x_j + c_2 S^2(x_j) + c_3 \int_0^1 S^2(t)\, dt , \tag{1.2} $$

where $(c_i)_{0\le i\le 3}$ are some unknown positive constants.

The goal of this paper is to study asymptotic properties of the adaptive
estimation procedure proposed in Galtchouk, Pergamenshchikov, 2007,
for which a non-asymptotic oracle inequality was proved for quadratic risks.
More precisely, in this paper we show that this procedure is efficient
under some conditions on the scale functions $(\sigma_j(S))_{1\le j\le n}$ which hold for the
functions (1.2). Note that in Efroimovich, 2007, Efroimovich, Pinsker, 1996,
an efficient adaptive procedure is constructed for heteroscedastic regression
when the scale coefficient is independent of $S$, i.e. $\sigma_j(S) = \sigma_j$. In Galtchouk,
Pergamenshchikov, 2005, for the model (1.1), the asymptotic efficiency was
proved under conditions which are not satisfied in the case (1.2). Moreover,
in these papers the efficiency is proved only for gaussian random
variables $(\xi_j)_{1\le j\le n}$ in model (1.1). This is a very restrictive condition for
applications.

In this paper we consider the robust quadratic risk, i.e. in the definition
of the risk we take an additional supremum over the family of unknown noise
distributions, as in Galtchouk, Pergamenshchikov, 2006. This modification
removes the dependence of the risk on the noise distribution.
Moreover, for this risk the efficient procedure is robust with respect to
changes in the noise distribution.

As is well known, to prove asymptotic efficiency one has to show
that the asymptotic quadratic risk coincides with the lower bound, which
is equal to the Pinsker constant. In the paper two problems are resolved:
in the first one an upper bound for the risk is obtained by making use of
the non-asymptotic oracle inequality from Galtchouk, Pergamenshchikov,
2007; in the second one we prove that this upper bound coincides with the
Pinsker constant. Let us recall that the adaptive procedure proposed in
Galtchouk, Pergamenshchikov, 2007, is based on weighted least squares
estimators, where the weights are proper modifications of the Pinsker weights for
the homogeneous case (when $\sigma_1(S) = \ldots = \sigma_n(S) = 1$) relative to a certain
smoothness of the function $S$, and this procedure chooses a best estimator for
the quadratic risk among these estimators. To obtain the Pinsker constant
for the model (1.1) one has to prove a sharp asymptotic lower bound for the
quadratic risk in the case when the noise variance depends on the unknown
regression function. In this case, as usual, we bound the minimax risk from below
by a bayesian one for a respective parametric family. Then for the bayesian
risk we make use of a lower bound (see Theorem 6.1) which is a modification
of the van Trees inequality (see Gill, Levit, 1995).

The paper is organized as follows. In Section 2 we construct an adaptive
estimation procedure. Section 3 contains the principal conditions. The
statements of our major results (the upper and lower bounds for quadratic risks)
are presented in Section 4. The upper bound is proved in Section 5. In
Section 6 we give all main steps of proving the lower bound: in Subsection 6.1 we
find the lower bound for the bayesian risk for a parametric regression model
which bounds the minimax risk from below; in Subsection 6.2 we study a special
family of parametric functions used to define the bayesian risk; in Subsection 6.3
we choose a prior distribution for the bayesian risk to maximize the lower bound.
Section 7 explains how to use the given procedure in the case
when the unknown regression function is non-periodic. In Section 8 we
discuss the main results and their practical importance. The proofs are given
in Section 9.



2 Adaptive procedure 

In this section we describe the adaptive procedure proposed in Galtchouk,
Pergamenshchikov, 2006. To evaluate the error of estimation in the model
(1.1) we make use of the empiric quadratic norm in the Hilbert space $\mathcal{L}_2[0,1]$
generated by the design points $(x_j)_{1\le j\le n}$ of model (1.1). To this end, for any
functions $u$ and $v$ from $\mathcal{L}_2[0,1]$, we define the empiric inner product

$$ (u, v)_n = \frac{1}{n} \sum_{l=1}^{n} u(x_l)\, v(x_l) . \tag{2.1} $$

Therefore, the estimation error of an estimator $\hat S$ of $S$ will be evaluated by
the empiric quadratic loss function

$$ \|\hat S - S\|_n^2 = \frac{1}{n} \sum_{l=1}^{n} \big(\hat S(x_l) - S(x_l)\big)^2 . $$

Moreover, we make use of this inner product for vectors in $\mathbb{R}^n$ as well, i.e. if
$u = (u_1, \ldots, u_n)'$ and $v = (v_1, \ldots, v_n)'$, then

$$ (u, v)_n = \frac{1}{n}\, u' v = \frac{1}{n} \sum_{l=1}^{n} u_l\, v_l . $$

The prime denotes the transposition.

Let now $(\phi_j)_{j\ge 1}$ be the standard trigonometric basis in $\mathcal{L}_2[0,1]$, i.e.

$$ \phi_1(x) = 1 , \quad \phi_j(x) = \sqrt{2}\, Tr_j(2\pi [j/2] x) , \quad j \ge 2 , \tag{2.2} $$

where the function $Tr_j(x) = \cos(x)$ for even $j$ and $Tr_j(x) = \sin(x)$ for odd
$j$; $[x]$ denotes the integer part of $x$.
Notice that if $n$ is odd, then the functions $(\phi_j)_{1\le j\le n}$ are orthonormal with
respect to the empiric inner product (2.1), i.e. for any $1 \le i, j \le n$,

$$ (\phi_i, \phi_j)_n = \frac{1}{n} \sum_{l=1}^{n} \phi_i(x_l)\, \phi_j(x_l) = Kr_{ij} , \tag{2.3} $$

where $Kr_{ij}$ is Kronecker's symbol, $Kr_{ij} = 1$ if $i = j$ and $Kr_{ij} = 0$ for $i \ne j$.

Remark 2.1. Note that in the case of even $n$, the basis (2.2) is orthogonal,
and it is orthonormal except for the $n$th function, for which the normalizing
constant should be changed. The corresponding modifications for even $n$
can be found in Galtchouk, Pergamenshchikov, 2005. To avoid these complications,
we suppose $n$ to be odd.
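
The orthonormality of the trigonometric basis on the design $x_l = l/n$ can be checked numerically. The following is a minimal sketch, not part of the original procedure; the function names `trig_basis`, `inner` and `gram` are illustrative assumptions.

```python
import math


def trig_basis(n):
    """Rows are phi_1, ..., phi_n evaluated at the design points x_l = l/n."""
    xs = [l / n for l in range(1, n + 1)]
    rows = [[1.0] * n]  # phi_1 = 1
    for j in range(2, n + 1):
        tr = math.cos if j % 2 == 0 else math.sin  # cos for even j, sin for odd j
        rows.append([math.sqrt(2) * tr(2 * math.pi * (j // 2) * x) for x in xs])
    return rows


def inner(u, v):
    """Empiric inner product (u, v)_n = (1/n) * sum_l u_l * v_l."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


def gram(n):
    """Matrix of empiric inner products (phi_i, phi_j)_n."""
    phi = trig_basis(n)
    return [[inner(phi[i], phi[j]) for j in range(n)] for i in range(n)]
```

For odd $n$ the matrix returned by `gram(n)` is the identity up to rounding. For even $n$ the last diagonal entry equals 2, which is the normalization issue mentioned in Remark 2.1.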

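Thanks to this basis we pass to the discrete Fourier transformation of
model (1.1):

$$ \hat\theta_{j,n} = \theta_{j,n} + \frac{1}{\sqrt{n}}\, \xi_{j,n} , \tag{2.4} $$

where $\hat\theta_{j,n} = (Y, \phi_j)_n$, $Y = (y_1, \ldots, y_n)'$, $\theta_{j,n} = (S, \phi_j)_n$ and

$$ \xi_{j,n} = \frac{1}{\sqrt{n}} \sum_{l=1}^{n} \sigma_l(S)\, \xi_l\, \phi_j(x_l) . $$

We estimate the function $S$ by the weighted least squares estimator

$$ \hat S_\lambda = \sum_{j=1}^{n} \lambda(j)\, \hat\theta_{j,n}\, \phi_j , \tag{2.5} $$

where the weight vector $\lambda = (\lambda(1), \ldots, \lambda(n))'$ belongs to some finite set $\Lambda$
from $[0,1]^n$ with $n \ge 3$.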
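
The passage to empiric Fourier coefficients and the weighted least squares estimator can be sketched as follows. This is an illustrative sketch under the assumption that the estimator has the form $\hat S_\lambda = \sum_j \lambda(j)\, \hat\theta_{j,n}\, \phi_j$ with $\hat\theta_{j,n} = (Y, \phi_j)_n$; the function names are hypothetical.

```python
import math


def phi(j, x):
    """Trigonometric basis function phi_j on [0, 1]."""
    if j == 1:
        return 1.0
    tr = math.cos if j % 2 == 0 else math.sin
    return math.sqrt(2) * tr(2 * math.pi * (j // 2) * x)


def fourier_coefficients(y):
    """Empiric coefficients theta_hat_j = (1/n) * sum_l y_l * phi_j(l/n)."""
    n = len(y)
    return [sum(y[l - 1] * phi(j, l / n) for l in range(1, n + 1)) / n
            for j in range(1, n + 1)]


def weighted_lse(y, weights):
    """Values of the weighted least squares estimator at the design points."""
    n = len(y)
    theta_hat = fourier_coefficients(y)
    return [sum(weights[j - 1] * theta_hat[j - 1] * phi(j, l / n)
                for j in range(1, n + 1))
            for l in range(1, n + 1)]
```

With all weights equal to 1 and odd $n$ the estimator reproduces the observations exactly, and with only the first weight equal to 1 it returns the sample mean at every design point. Smaller weights on high frequencies trade variance for bias.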

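Here we make use of the weight family $\Lambda$ introduced in Galtchouk,
Pergamenshchikov, 2009, i.e.

$$ \Lambda = \{\lambda_\alpha ,\ \alpha \in \mathcal{A}\} , \quad \mathcal{A} = \{1, \ldots, k^*\} \times \{t_1, \ldots, t_m\} , \tag{2.6} $$

where $t_i = i\varepsilon$ and $m = [1/\varepsilon^2]$. We suppose that the parameters $k^* \ge 1$ and
$0 < \varepsilon \le 1$ are functions of $n$, i.e. $k^* = k^*_n$ and $\varepsilon = \varepsilon_n$, such that

$$ \lim_{n\to\infty} k^*_n = +\infty , \quad \lim_{n\to\infty} \frac{k^*_n}{\ln n} = 0 , \quad \lim_{n\to\infty} \varepsilon_n = 0 \quad \text{and} \quad \lim_{n\to\infty} n^{\nu} \varepsilon_n = +\infty \tag{2.7} $$

for any $\nu > 0$. For example, one can take for $n \ge 3$

$$ \varepsilon_n = 1/\ln n \quad \text{and} \quad k^*_n = \bar k + \sqrt{\ln n} , $$

where $\bar k$ is any nonnegative constant.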

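For each $\alpha = (\beta, t) \in \mathcal{A}$ we define the weight vector $\lambda_\alpha = (\lambda_\alpha(1), \ldots, \lambda_\alpha(n))'$
as

$$ \lambda_\alpha(j) = \mathbf{1}_{\{1 \le j \le j_0\}} + \big(1 - (j/\omega(\alpha))^{\beta}\big)\, \mathbf{1}_{\{j_0 < j \le \omega(\alpha)\}} . \tag{2.8} $$

Here $j_0 = j_0(\alpha) = [\omega(\alpha)\, \varepsilon_n]$ with

$$ \omega(\alpha) = \varpi + (A_\beta\, t)^{1/(2\beta+1)}\, n^{1/(2\beta+1)} , \tag{2.9} $$

where $\varpi$ is any nonnegative constant and

$$ A_\beta = \frac{(\beta+1)(2\beta+1)}{\pi^{2\beta}\, \beta} . $$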
Remark 2.2. Note that the weighted least squares estimators (2.5) have been
introduced by Pinsker, 1981, for the optimal filtering of a continuous time
signal in gaussian noise. It turned out that the asymptotic quadratic
risk for estimators of type (2.5) - (2.8) is minimal over all possible estimators.
This sharp value of the asymptotic quadratic risk is called the Pinsker
constant. Nussbaum, 1985, makes use of the same method with proper
modification for efficient estimation of the function $S$ of known smoothness in the
homogeneous gaussian model (1.1), i.e. when $\sigma_1(S) = \ldots = \sigma_n(S) = 1$ and
$(\xi_j)_{1\le j\le n}$ is a sequence of i.i.d. $\mathcal{N}(0,1)$ random variables.

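To choose weights from the set (2.6) we minimize the special cost function
introduced by Galtchouk, Pergamenshchikov, 2007. This cost function is as
follows:

$$ J_n(\lambda) = \sum_{j=1}^{n} \lambda^2(j)\, \hat\theta_{j,n}^2 - 2 \sum_{j=1}^{n} \lambda(j)\, \tilde\theta_{j,n} + \rho\, P_n(\lambda) , \tag{2.10} $$

where

$$ \tilde\theta_{j,n} = \hat\theta_{j,n}^2 - \frac{1}{n}\, \hat\varsigma_n \quad \text{with} \quad \hat\varsigma_n = \sum_{j=l_n}^{n} \hat\theta_{j,n}^2 $$

and $l_n = [n^{1/3} + 1]$. The penalty term is defined as

$$ P_n(\lambda) = \frac{|\lambda|^2\, \hat\varsigma_n}{n} , \quad |\lambda|^2 = \sum_{j=1}^{n} \lambda^2(j) \quad \text{and} \quad \rho = \frac{1}{3 + L_n} , $$

where $L_n \ge 0$ is any slowly increasing sequence, i.e.

$$ \lim_{n\to\infty} L_n = +\infty \quad \text{and} \quad \lim_{n\to\infty} \frac{L_n}{n^{\nu}} = 0 \tag{2.12} $$

for any $\nu > 0$.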
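Finally, we set

$$ \hat\lambda = \mathrm{argmin}_{\lambda \in \Lambda}\, J_n(\lambda) \quad \text{and} \quad \hat S_* = \hat S_{\hat\lambda} . \tag{2.13} $$

The goal of this paper is to study asymptotic properties (as $n \to \infty$) of
this estimation procedure.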
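
The whole selection step can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: it takes the cost function in the form $J_n(\lambda) = \sum_j \lambda^2(j)\hat\theta_{j,n}^2 - 2\sum_j \lambda(j)\tilde\theta_{j,n} + \rho\, |\lambda|^2 \hat\varsigma_n / n$ with $\tilde\theta_{j,n} = \hat\theta_{j,n}^2 - \hat\varsigma_n/n$, $\hat\varsigma_n = \sum_{j \ge l_n} \hat\theta_{j,n}^2$, $\rho = 1/(3 + L_n)$, uses Pinsker-type weights with $\varpi = 0$, and all function names are hypothetical.

```python
import math
import random


def phi(j, x):
    """Trigonometric basis function phi_j on [0, 1]."""
    if j == 1:
        return 1.0
    tr = math.cos if j % 2 == 0 else math.sin
    return math.sqrt(2) * tr(2 * math.pi * (j // 2) * x)


def fourier_coefficients(y):
    """Empiric coefficients theta_hat_j = (Y, phi_j)_n on the design x_l = l/n."""
    n = len(y)
    return [sum(y[l - 1] * phi(j, l / n) for l in range(1, n + 1)) / n
            for j in range(1, n + 1)]


def weight_vector(n, beta, t, eps):
    """Assumed Pinsker-type weights with omega = (A_beta * t * n)^(1/(2 beta + 1))."""
    a_beta = (beta + 1) * (2 * beta + 1) / (math.pi ** (2 * beta) * beta)
    w = (a_beta * t * n) ** (1.0 / (2 * beta + 1))
    j0 = math.floor(w * eps)
    return [1.0 if j <= j0 else max(0.0, 1.0 - (j / w) ** beta)
            for j in range(1, n + 1)]


def cost(lam, theta_hat, rho):
    """Penalized cost J_n(lambda) in the assumed form described above."""
    n = len(theta_hat)
    # l_n = [n^(1/3) + 1]; floating-point rounding may matter if n is a perfect cube
    l_n = math.floor(n ** (1.0 / 3.0) + 1)
    varsigma_hat = sum(th * th for th in theta_hat[l_n - 1:])
    theta_tilde = [th * th - varsigma_hat / n for th in theta_hat]
    quad = sum(l * l * th * th for l, th in zip(lam, theta_hat))
    cross = sum(l * tt for l, tt in zip(lam, theta_tilde))
    penalty = sum(l * l for l in lam) * varsigma_hat / n
    return quad - 2.0 * cross + rho * penalty


def select_weights(y, k_star, eps, L_n):
    """Return the index alpha = (beta, t) minimizing the cost, and its weights."""
    n = len(y)
    theta_hat = fourier_coefficients(y)
    rho = 1.0 / (3.0 + L_n)
    m = math.floor(1.0 / eps ** 2)
    grid = [(beta, i * eps) for beta in range(1, k_star + 1) for i in range(1, m + 1)]
    best = min(grid, key=lambda a: cost(weight_vector(n, a[0], a[1], eps), theta_hat, rho))
    return best, weight_vector(n, best[0], best[1], eps)
```

A typical call is `select_weights(y, k_star=2, eps=0.5, L_n=1.0)` for a vector `y` of odd length. The selected weights can then be plugged into the weighted least squares estimator. The grid is small because $m = [1/\varepsilon^2]$ and $k^*$ grow slowly with $n$.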

Remark 2.3. Now we explain why the cost function is chosen in the
form (2.10). Expanding the empiric quadratic loss function for the estimator
(2.5), one obtains

$$ \|\hat S_\lambda - S\|_n^2 = \sum_{j=1}^{n} \lambda^2(j)\, \hat\theta_{j,n}^2 - 2 \sum_{j=1}^{n} \lambda(j)\, \hat\theta_{j,n}\, \theta_{j,n} + \|S\|_n^2 . $$

It is natural to choose the weight vector $\lambda$ for which this function reaches its
minimum. Since the last term on the right-hand side is independent of $\lambda$, it
can be dropped, and one has to minimize with respect to $\lambda$ the function equal
to the difference of the first two terms on the right-hand side. Clearly,
this minimization problem cannot be solved directly because the Fourier
coefficients $(\theta_{j,n})$ are unknown. To overcome this difficulty, we replace the
product $\hat\theta_{j,n}\, \theta_{j,n}$ by its asymptotically unbiased estimator $\tilde\theta_{j,n}$. Moreover, to
pay for this substitution, we introduce into the cost function the penalty term
$P_n$ with a small coefficient $\rho > 0$. The form of the penalty term is provided
by the principal term of the quadratic risk for the weighted least squares
estimator, see Galtchouk, Pergamenshchikov, 2007, 2009. The coefficient $\rho > 0$
is small because the estimator $\tilde\theta_{j,n}$ is close in mean to the quantity $\hat\theta_{j,n}\, \theta_{j,n}$
asymptotically, as $n \to \infty$.

Note that the principal difference between the procedure (2.13) and the
adaptive procedure proposed by Golubev, Nussbaum, 1993, for a homogeneous
gaussian regression, consists in the presence of the penalty term in the cost
function (2.10).

Remark 2.4. As noted in Remark 2.2, Nussbaum, 1985, has shown
that the weight coefficients of type (2.8) provide the asymptotic minimum of
the quadratic risk in the estimation problem of the regression function for the
homogeneous gaussian model (1.1), when the smoothness of the function $S$ is
known. In fact, to obtain an efficient estimator one needs to take a weighted
least squares estimator (2.5) with the weight vector $\lambda_\alpha$, where the index $\alpha$
depends on the smoothness of the function $S$ (the parameters $k$ and $r$ in (3.1))
and on the coefficients $(\sigma_j(S))_{1\le j\le n}$ (the parameter $\varsigma(S)$ in (3.6)), which are
unknown in our case. For this reason, Galtchouk, Pergamenshchikov, 2007,
have proposed to make use of the family of coefficients (2.6), which contains
the weight vector providing the minimum of the quadratic risk. Indeed, the set
(2.6) is a two-dimensional grid giving all possible values for the parameters
$k$ and $\bar r(S) = r/\varsigma(S)$ in the estimator (2.5) with the weight (2.8), which is
efficient if the parameters $k$ and $\bar r(S)$ are known, i.e. the set $\Lambda$ gives the
family of efficient estimators. Moreover, under some weak conditions on the
coefficients $(\sigma_j(S))_{1\le j\le n}$, Galtchouk, Pergamenshchikov, 2007, showed that
the procedure (2.13) is best in the class of these estimators in the sense of the
non-asymptotic oracle inequality (see Theorem 4.1 below). It is important
to note that, due to the conditions (2.7) for the set $\Lambda$, the secondary term in
the oracle inequality is slowly increasing (slower than any power of $n$).



3 Conditions 

First we impose some conditions on the function $S$ in the model (1.1).

Let $\mathcal{C}^k_{per,1}(\mathbb{R})$ be the set of 1-periodic, $k$ times differentiable $\mathbb{R} \to \mathbb{R}$ functions.
We assume that $S$ belongs to the following set:

$$ W_r^k = \Big\{ f \in \mathcal{C}^k_{per,1}(\mathbb{R}) \,:\, \sum_{j=0}^{k} \|f^{(j)}\|^2 \le r \Big\} , \tag{3.1} $$

where $\|\cdot\|$ denotes the norm in $\mathcal{L}_2[0,1]$, i.e.

$$ \|f\|^2 = \int_0^1 f^2(t)\, dt . \tag{3.2} $$

Moreover, we suppose that $r > 0$ and $k \ge 1$ are unknown parameters.
Note that the set $W_r^k$ can be represented as an ellipse in $\mathcal{L}_2[0,1]$, i.e.

$$ W_r^k = \Big\{ f \in \mathcal{L}_2[0,1] \,:\, \sum_{j=1}^{\infty} a_j\, \theta_j^2 \le r \Big\} , \tag{3.3} $$

where

$$ \theta_j = \int_0^1 f(t)\, \phi_j(t)\, dt \tag{3.4} $$

and

$$ a_j = \sum_{l=0}^{k} \big(2\pi [j/2]\big)^{2l} . \tag{3.5} $$

Here $(\phi_j)_{j\ge 1}$ is the trigonometric basis defined in (2.2).

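Now we describe the conditions on the scale coefficients $(\sigma_j(S))_{j\ge 1}$.

$\mathbf{H}_1$) $\sigma_j(S) = g(x_j, S)$ for some unknown function $g : [0,1] \times \mathcal{L}_1[0,1] \to \mathbb{R}_+$
which is square integrable with respect to $x$ and such that

$$ \lim_{n\to\infty}\ \sup_{S \in W_r^k} \Big| \frac{1}{n} \sum_{j=1}^{n} g^2(x_j, S) - \varsigma(S) \Big| = 0 , \tag{3.6} $$

where $\varsigma(S) := \int_0^1 g^2(x, S)\, dx$. Moreover,

$$ g_* = \inf_{0\le x\le 1}\ \inf_{S \in W_r^k} g^2(x, S) > 0 \tag{3.7} $$

and

$$ \sup_{S \in W_r^k} \varsigma(S) < \infty . \tag{3.8} $$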



$\mathbf{H}_2$) For any $x \in [0,1]$, the operator $g^2(x, \cdot) : \mathcal{C}[0,1] \to \mathbb{R}$ is differentiable
in the Fréchet sense at any fixed function $f_0$ from $\mathcal{C}[0,1]$, i.e. for any
$f$ from some vicinity of $f_0$ in $\mathcal{C}[0,1]$,

$$ g^2(x, f) = g^2(x, f_0) + \mathbf{L}_{x,f_0}(f - f_0) + \Upsilon(x, f_0, f) , $$

where the Fréchet derivative $\mathbf{L}_{x,f_0} : \mathcal{C}[0,1] \to \mathbb{R}$ is a bounded linear
operator and the residual term $\Upsilon(x, f_0, f)$, for each $x \in [0,1]$, satisfies
the following property:

$$ \lim_{\|f - f_0\|_\infty \to 0} \frac{|\Upsilon(x, f_0, f)|}{\|f - f_0\|_\infty} = 0 , $$

where $\|f\|_\infty = \sup_{0\le t\le 1} |f(t)|$.

$\mathbf{H}_3$) There exists some positive constant $C^*$ such that, for any function $S$
from $\mathcal{C}[0,1]$, the operator $\mathbf{L}_{x,S}$ defined in the condition $\mathbf{H}_2$) satisfies the
following inequality, for any function $f$ from $\mathcal{C}[0,1]$:

$$ |\mathbf{L}_{x,S}(f)| \le C^* \big( |S(x) f(x)| + |f|_1 + \|S\|\, \|f\| \big) , \tag{3.9} $$

where $|f|_1 = \int_0^1 |f(t)|\, dt$.

$\mathbf{H}_4$) The function $g_0(\cdot) = g(\cdot, S_0)$ corresponding to $S_0 \equiv 0$ is continuous on
the interval $[0,1]$. Moreover,

$$ \lim_{\delta\to 0}\ \sup_{0\le x\le 1}\ \sup_{\|S\|_\infty \le \delta} |g(x, S) - g(x, S_0)| = 0 . $$

Remark 3.1. Let us explain the conditions $\mathbf{H}_1$)-$\mathbf{H}_4$), which are the
regularity conditions of the function $g(x, S)$ generating the scale coefficients
$(\sigma_j(S))_{1\le j\le n}$.

Condition $\mathbf{H}_1$) means that the function $g(\cdot, S)$ must be uniformly integrable
with respect to the first argument in the sense of convergence (3.6).
Moreover, this function must be separated from zero (see inequality (3.7))
and bounded on the class (3.1) (see inequality (3.8)). Boundedness away
from zero ensures that the distribution of the observations $(y_j)_{1\le j\le n}$ is not
degenerate in $\mathbb{R}^n$, and the boundedness means that the intensity of the noise vector
must be finite; otherwise the estimation problem makes no sense.

Conditions $\mathbf{H}_2$) and $\mathbf{H}_3$) mean that the function $g(x, \cdot)$ is regular, at any
fixed $0 \le x \le 1$, with respect to $S$ in the sense that it is differentiable in
the Fréchet sense (see e.g., Kolmogorov, Fomin, 1989) and, moreover, the
Fréchet derivative satisfies the growth condition given by the inequality (3.9),
which permits us to consider the example (1.2).

The last condition $\mathbf{H}_4$) is the usual uniform continuity condition of the
function $g(\cdot, \cdot)$ at the point $S \equiv 0$.

One checks directly that the function (1.2) satisfies the conditions $\mathbf{H}_1$)-$\mathbf{H}_4$).
Other functions satisfying these conditions are given in Galtchouk,
Pergamenshchikov, 2008.
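
For the example scale function $\sigma_j^2(S) = c_0 + c_1 x_j + c_2 S^2(x_j) + c_3 \int_0^1 S^2(t)\,dt$, the averaging property required in $\mathbf{H}_1$) can be illustrated numerically. The sketch below is only an illustration: the constants, the test function $S(x) = \sin(2\pi x)$ and the function names are assumptions, and the integrals are approximated by a midpoint rule.

```python
import math


def l2_norm_sq(S, grid=2000):
    """Midpoint-rule approximation of the integral of S^2 over [0, 1]."""
    return sum(S((i + 0.5) / grid) ** 2 for i in range(grid)) / grid


def scale_sq(x, S, c, norm_sq):
    """g^2(x, S) = c0 + c1*x + c2*S(x)^2 + c3*||S||^2 for c = (c0, c1, c2, c3)."""
    c0, c1, c2, c3 = c
    return c0 + c1 * x + c2 * S(x) ** 2 + c3 * norm_sq


def varsigma(S, c, grid=2000):
    """Approximation of varsigma(S), the integral of g^2(x, S) over [0, 1]."""
    norm_sq = l2_norm_sq(S, grid)
    return sum(scale_sq((i + 0.5) / grid, S, c, norm_sq) for i in range(grid)) / grid


def empirical_mean(S, c, n):
    """Design average (1/n) * sum_j g^2(x_j, S) with x_j = j/n."""
    norm_sq = l2_norm_sq(S)
    return sum(scale_sq(j / n, S, c, norm_sq) for j in range(1, n + 1)) / n
```

For $S(x) = \sin(2\pi x)$ the design average differs from $\varsigma(S)$ by $c_1/(2n)$, which tends to zero as $n$ grows, in line with the convergence required in $\mathbf{H}_1$).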
4 Main results

Denote by $\mathcal{P}_n$ the family of distributions $p$ in $\mathbb{R}^n$ of the vectors $(\xi_1, \ldots, \xi_n)'$ in
the model (1.1) such that the components $\xi_j$ are jointly independent, centered
with unit variance and

$$ \max_{1\le j\le n} \mathbf{E}\, \xi_j^4 \le l_n^* , \tag{4.1} $$

where $l_n^* \ge 3$ is a slowly increasing sequence, i.e. it satisfies the condition
(2.12). It is easy to see that, for any $n \ge 1$, the centered gaussian distribution
in $\mathbb{R}^n$ with unit covariance matrix belongs to the family $\mathcal{P}_n$. We will denote
by $q$ this gaussian distribution.

For any estimator $\hat S$, we define the following quadratic risk

$$ \mathcal{R}_n(\hat S, S) = \sup_{p \in \mathcal{P}_n} \mathbf{E}_{S,p}\, \|\hat S - S\|_n^2 , \tag{4.2} $$

where $\mathbf{E}_{S,p}$ is the expectation with respect to the distribution $\mathbf{P}_{S,p}$ of the
observations $(y_1, \ldots, y_n)$ with the fixed function $S$ and the fixed distribution
$p \in \mathcal{P}_n$ of the random variables $(\xi_j)_{1\le j\le n}$ in the model (1.1).

Moreover, to make the risk independent of the design, we will also make use
of the risk with respect to the usual norm (3.2) in $\mathcal{L}_2[0,1]$, i.e.

$$ \mathcal{T}_n(\hat S, S) = \sup_{p \in \mathcal{P}_n} \mathbf{E}_{S,p}\, \|\hat S - S\|^2 . \tag{4.3} $$

If an estimator $\hat S$ is defined only at the design points $(x_j)_{1\le j\le n}$, then we
extend it as a step function onto the interval $[0,1]$ by setting $\hat S(x) = T(\hat S)(x)$,
for all $0 \le x \le 1$, where

$$ T(f)(x) = f(x_1)\, \mathbf{1}_{[0, x_1]}(x) + \sum_{k=2}^{n} f(x_k)\, \mathbf{1}_{(x_{k-1}, x_k]}(x) . \tag{4.4} $$
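
The step-function extension is straightforward to implement. The following sketch assumes the equidistant design $x_k = k/n$ and the convention that the value $f(x_k)$ is used on $(x_{k-1}, x_k]$, with $f(x_1)$ on $[0, x_1]$; the name `step_extension` is hypothetical.

```python
import math


def step_extension(values):
    """Extend values f(x_1), ..., f(x_n), x_k = k/n, to a step function on [0, 1]."""
    n = len(values)

    def extended(x):
        if not 0.0 <= x <= 1.0:
            raise ValueError("x must lie in [0, 1]")
        # x in (x_{k-1}, x_k] gives ceil(x * n) = k; x = 0 is mapped to k = 1.
        # Floating-point rounding can matter when x * n is very close to an integer.
        k = max(1, math.ceil(x * n))
        return values[k - 1]

    return extended
```

With this extension any estimator known only at the design points can be compared with $S$ in the usual $\mathcal{L}_2[0,1]$ norm.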

In Galtchouk, Pergamenshchikov, 2007, 2009, the following non-asymptotic
oracle inequality has been proved for the procedure (2.13).

Theorem 4.1. Assume that $S \in W_r^1$ for some $r > 0$. Let $\hat S_*$ be from (2.13).
Then, for any odd $n \ge 3$, the following oracle inequality holds:

$$ \mathcal{R}_n(\hat S_*, S) \le \frac{1 + 3\rho - 2\rho^2}{1 - 3\rho}\, \min_{\lambda \in \Lambda} \mathcal{R}_n(\hat S_\lambda, S) + \frac{1}{n}\, \mathcal{B}_n(\rho) , \tag{4.5} $$

where $\rho = 1/(3 + L_n)$, $L_n$ is from (2.12), and the function $\mathcal{B}_n(\rho)$ is such that, for
any $\nu > 0$,

$$ \lim_{n\to\infty} \frac{\mathcal{B}_n(\rho)}{n^{\nu}} = 0 . \tag{4.6} $$

Remark 4.1. Note that in Galtchouk, Pergamenshchikov, 2007, 2009, the
oracle inequality is proved for the model (1.1), where the random variables
$(\xi_j)_{1\le j\le n}$ are independent identically distributed. In fact, the result is true
for independent random variables which are not identically distributed, i.e.
for any distribution of the random vector $(\xi_1, \ldots, \xi_n)'$ from $\mathcal{P}_n$.

Now we formulate the main asymptotic results. To this end, for any
function $S \in W_r^k$, we set

$$ \gamma_k(S) = \Gamma_k^*\, r^{1/(2k+1)}\, (\varsigma(S))^{2k/(2k+1)} , \tag{4.7} $$

where

$$ \Gamma_k^* = (2k+1)^{1/(2k+1)}\, \big(k/(\pi (k+1))\big)^{2k/(2k+1)} . $$

It is well known (see e.g., Nussbaum, 1985) that the optimal rate of convergence
is $n^{2k/(2k+1)}$ when the risk is taken uniformly over $W_r^k$.
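Theorem 4.2. Assume that in the model (1.1) the sequence $(\sigma_j(S))$ satisfies
the condition $\mathbf{H}_1$). Then, for any integer $k \ge 1$ and $r > 0$, the estimator $\hat S_*$
from (2.13) satisfies the inequalities

$$ \limsup_{n\to\infty}\ n^{2k/(2k+1)} \sup_{S \in W_r^k} \frac{\mathcal{R}_n(\hat S_*, S)}{\gamma_k(S)} \le 1 \tag{4.8} $$

and

$$ \limsup_{n\to\infty}\ n^{2k/(2k+1)} \sup_{S \in W_r^k} \frac{\mathcal{T}_n(\hat S_*, S)}{\gamma_k(S)} \le 1 . \tag{4.9} $$

The following result gives the sharp lower bound for the risk (4.2) and
shows that $\gamma_k(S)$ is the Pinsker constant.

Theorem 4.3. Assume that in the model (1.1) the sequence $(\sigma_j(S))$ satisfies
the conditions $\mathbf{H}_2$)-$\mathbf{H}_4$). Then, for any integer $k \ge 1$ and $r > 0$, the risks
(4.2) and (4.3) admit the following asymptotic lower bounds, respectively,

$$ \liminf_{n\to\infty}\ n^{2k/(2k+1)} \inf_{\hat S_n}\ \sup_{S \in W_r^k} \frac{\mathcal{R}_n(\hat S_n, S)}{\gamma_k(S)} \ge 1 \tag{4.10} $$

and

$$ \liminf_{n\to\infty}\ n^{2k/(2k+1)} \inf_{\hat S_n}\ \sup_{S \in W_r^k} \frac{\mathcal{T}_n(\hat S_n, S)}{\gamma_k(S)} \ge 1 . \tag{4.11} $$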

Remark 4.2. To obtain the non-asymptotic oracle inequality (4.5), it is
not necessary to make use of equidistant design points and the trigonometric
basis. One may take any design points (deterministic or random) and any
orthonormal basis satisfying (2.3). But to obtain the property (4.6) one needs to
impose some technical conditions (see Galtchouk, Pergamenshchikov, 2009).

Note that the results of Theorem 4.2 and Theorem 4.3 are based on
equidistant design points and the trigonometric basis.



5 Upper bound 

In this section we prove Theorem 4.2. To this end we will make use of
the oracle inequality (4.5). We have to find an estimator from the family
(2.5) - (2.6) for which we can prove the upper bound (4.8). We start with the
construction of such an estimator. First we put

$$ \tilde l_n = \inf\{ i \ge 1 \,:\, i\varepsilon \ge \bar r(S) \} \wedge m \quad \text{and} \quad \bar r(S) = r/\varsigma(S) , \tag{5.1} $$

where $a \wedge b = \min(a, b)$. Then we choose an index from the set $\mathcal{A}$ as

$$ \tilde\alpha = (k, \tilde t_n) , \tag{5.2} $$

where $k$ is the parameter of the set $W_r^k$ and $\tilde t_n = \tilde l_n\, \varepsilon$. Finally, we set

$$ \tilde S = \hat S_{\tilde\lambda} \quad \text{and} \quad \tilde\lambda = \lambda_{\tilde\alpha} . \tag{5.3} $$

Now we formulate the upper bound (4.8) for this estimator.

Theorem 5.1. Assume that the condition $\mathbf{H}_1$) holds. Then, for any integer
$k \ge 1$ and $r > 0$,

$$ \limsup_{n\to\infty}\ n^{2k/(2k+1)} \sup_{S \in W_r^k} \frac{\mathcal{R}_n(\tilde S, S)}{\gamma_k(S)} \le 1 . \tag{5.4} $$

Remark 5.1. Note that the estimator $\tilde S$ belongs to the family (2.5) - (2.6), but
we cannot use this estimator directly because the parameters $k$, $r$ and $\bar r(S)$
are unknown. We can use this upper bound only via the oracle inequality
(4.5) proved for the procedure (2.13).

Now Theorem 4.1 and Theorem 5.1 imply the upper bound (4.8). To
obtain the upper bound (4.9) one needs the following auxiliary result.

Lemma 5.2. For any $0 < \delta < 1$ and any estimator $\hat S_n$ of $S \in W_r^k$,

$$ \|\hat S_n - S\|_n^2 \ge (1 - \delta)\, \|T_n(\hat S_n) - S\|^2 - (\delta^{-1} - 1)\, r/n^2 , $$

where the function $T_n(\hat S_n)(\cdot)$ is defined in (4.4).

The proof of this lemma is given in Galtchouk, Pergamenshchikov, 2008
(Appendix A.1).

Now the inequality (4.8) and this lemma imply the upper bound (4.9).
Hence Theorem 4.2.

6 Lower bound 

In this section we give the main steps of proving the lower bounds (4.10) and
(4.11). We follow the commonly used scheme (see e.g. Nussbaum, 1985).
We begin by bounding the minimax risk from below by a bayesian one constructed
on a parametric functional family introduced in Section 6.2 (see (6.9)) and
using the prior distribution (6.10). Further, a special modification of the van
Trees inequality (see Theorem 6.1) yields a lower bound for the bayesian risk
depending on the chosen prior distribution. Finally, in Section 6.3
we choose parameters of the prior distribution (see (6.10)) providing the
maximal value of the lower bound for the bayesian risk. This value coincides
with the Pinsker constant, as shown in Section 9.4. We emphasize that, by
making use of the bayesian risk, one passes from the nonparametric regression
model to a parametric one, for which the van Trees inequality holds. Note
that this inequality is an extension of the Cramér-Rao inequality and gives
a lower bound for the bayesian risk.
6.1 Lower bound for parametric heteroscedastic regression models

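Let $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n), \mathbf{P}_\vartheta, \vartheta \in \Theta \subseteq \mathbb{R}^l)$ be the statistical model relative to the
observations $(y_j)_{1\le j\le n}$ governed by the regression equation

$$ y_j = S_\vartheta(x_j) + \sigma_j(\vartheta)\, \xi_j , \tag{6.1} $$

where $\xi_1, \ldots, \xi_n$ are i.i.d. $\mathcal{N}(0,1)$ random variables, $\vartheta = (\vartheta_1, \ldots, \vartheta_l)'$ is an
unknown parameter vector, $S_\vartheta(x)$ is an unknown (or known) function and
$\sigma_j(\vartheta) = g(x_j, S_\vartheta)$, with the function $g(x, S)$ defined in the condition $\mathbf{H}_1$).
Assume that a prior distribution $\mu_\vartheta$ of the parameter $\vartheta$ in $\mathbb{R}^l$ is defined by
the density $\Phi(\cdot)$ of the following form

$$ \Phi(z) = \Phi(z_1, \ldots, z_l) = \prod_{i=1}^{l} \varphi_i(z_i) , $$

where $\varphi_i$ is a continuously differentiable bounded density on $\mathbb{R}$ with

$$ I_i = \int_{\mathbb{R}} \frac{\dot\varphi_i^2(u)}{\varphi_i(u)}\, du < \infty . $$

Let $\tau(\cdot)$ be a continuously differentiable $\mathbb{R}^l \to \mathbb{R}$ function such that, for any
$1 \le i \le l$,

$$ \lim_{|z_i|\to\infty} \tau(z)\, \varphi_i(z_i) = 0 \quad \text{and} \quad \int_{\mathbb{R}^l} |\tau_i'(z)|\, \Phi(z)\, dz < \infty , \tag{6.2} $$

where $\tau_i'(z) = (\partial/\partial z_i)\, \tau(z)$.

Let $\hat\tau_n$ be an estimator of $\tau(\vartheta)$ based on the observations $(y_j)_{1\le j\le n}$. For any
$\mathcal{B}(\mathbb{R}^n \times \mathbb{R}^l)$-measurable integrable function $G(x, z)$, $x \in \mathbb{R}^n$, $z \in \mathbb{R}^l$, we set

$$ \tilde{\mathbf{E}}\, G(Y, \vartheta) = \int_{\mathbb{R}^l} \mathbf{E}_z\, G(Y, z)\, \Phi(z)\, dz , $$

where $\mathbf{E}_\vartheta$ is the expectation with respect to the distribution $\mathbf{P}_\vartheta$ of the vector
$Y = (y_1, \ldots, y_n)$. Note that in this case

$$ \tilde{\mathbf{E}}\, G(Y, \vartheta) = \int_{\mathbb{R}^n \times \mathbb{R}^l} G(v, z)\, f(v, z)\, \Phi(z)\, dz\, dv , $$

where

$$ f(v, z) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\, \sigma_j(z)} \exp\Big\{ -\frac{(v_j - S_z(x_j))^2}{2\, \sigma_j^2(z)} \Big\} . \tag{6.3} $$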

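We prove the following result.

Theorem 6.1. Assume that the conditions $\mathbf{H}_1$)-$\mathbf{H}_2$) hold. Moreover, assume
that the function $S_z(\cdot)$ with $z = (z_1, \ldots, z_l)'$ is uniformly over $0 \le x \le 1$
differentiable with respect to $z_i$, $1 \le i \le l$, i.e. for any $1 \le i \le l$, there exists
a function $S'_{z_i} \in \mathcal{C}[0,1]$ such that

$$ \lim_{h\to 0}\ \max_{0\le x\le 1} \big| \big( S_{z + h e_i}(x) - S_z(x) - S'_{z_i}(x)\, h \big) / h \big| = 0 , \tag{6.4} $$

where $e_i = (0, \ldots, 1, \ldots, 0)'$; all coordinates are 0, except the $i$-th, which equals 1.
Then for any square integrable estimator $\hat\tau_n$ of $\tau(\vartheta)$ and any $1 \le i \le l$,

$$ \tilde{\mathbf{E}}\, (\hat\tau_n - \tau(\vartheta))^2 \ge \frac{\bar\tau_i^2}{F_i + B_i + I_i} , \tag{6.5} $$

where $\bar\tau_i = \int_{\mathbb{R}^l} \tau_i'(z)\, \Phi(z)\, dz$, $F_i = \sum_{j=1}^{n} \int_{\mathbb{R}^l} \big(S'_{z_i}(x_j)/\sigma_j(z)\big)^2\, \Phi(z)\, dz$ and

$$ B_i = \frac{1}{2} \sum_{j=1}^{n} \int_{\mathbb{R}^l} \frac{\tilde{\mathbf{L}}_i^2(x_j, z)}{\sigma_j^4(z)}\, \Phi(z)\, dz , $$

$\tilde{\mathbf{L}}_i(x, z) = \mathbf{L}_{x,S_z}(S'_{z_i})$, the operator $\mathbf{L}_{x,S}$ is defined in the condition $\mathbf{H}_2$).

Remark 6.1. Note that the inequality (6.5) is a modification of the van
Trees inequality (see Gill, Levit, 1995) adapted to the model (6.1).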



6.2 Parametric family of kernel functions 

In this section we define and study a special parametric family of kernel
functions which will be used to prove the sharp lower bound (4.10).
Let us begin with kernel functions. We fix $\eta > 0$ and we set

$$ \chi_\eta(x) = \frac{1}{\eta} \int_{\mathbb{R}} \mathbf{1}_{(|u| \le 1 - \eta)}\, V\Big(\frac{u - x}{\eta}\Big)\, du , \tag{6.6} $$

where $\mathbf{1}_A$ is the indicator of a set $A$, the kernel $V \in \mathcal{C}^\infty(\mathbb{R})$ is such that

$$ V(u) = 0 \ \text{for} \ |u| \ge 1 \quad \text{and} \quad \int_{-1}^{1} V(u)\, du = 1 . $$

It is easy to see that the function $\chi_\eta(x)$ possesses the properties:

$$ 0 \le \chi_\eta \le 1 , \quad \chi_\eta(x) = 1 \ \text{for} \ |x| \le 1 - 2\eta \quad \text{and} \quad \chi_\eta(x) = 0 \ \text{for} \ |x| \ge 1 . $$

Moreover, for any $c > 0$ and $\nu > 0$,

$$ \lim_{\eta\to 0}\ \sup_{f \,:\, \|f\|_\infty \le c} \Big| \int_{-1}^{1} f(x)\, \chi_\eta^{\nu}(x)\, dx - \int_{-1}^{1} f(x)\, dx \Big| = 0 . \tag{6.7} $$
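
The properties of the smoothed indicator $\chi_\eta$ can be checked numerically for a concrete kernel. The sketch below assumes $\chi_\eta(x) = \eta^{-1} \int \mathbf{1}_{(|u| \le 1-\eta)} V((u-x)/\eta)\, du$ and takes $V$ to be the standard normalized bump function $\exp(-1/(1-u^2))$ on $(-1,1)$, which is one admissible choice and not necessarily the kernel used by the authors. All names are illustrative and integrals use a midpoint rule.

```python
import math


def bump(u):
    """Unnormalized smooth bump supported on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - u * u)) if abs(u) < 1.0 else 0.0


def integrate(f, a, b, steps=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


BUMP_MASS = integrate(bump, -1.0, 1.0)


def V(u):
    """Smooth kernel vanishing outside (-1, 1) and integrating to one."""
    return bump(u) / BUMP_MASS


def chi(x, eta, steps=4000):
    """chi_eta(x) after the substitution w = (u - x) / eta, so u = x + eta * w."""
    return integrate(lambda w: V(w) if abs(x + eta * w) <= 1.0 - eta else 0.0,
                     -1.0, 1.0, steps)
```

With $\eta = 0.1$ the function equals 1 on $|x| \le 0.8$, vanishes for $|x| \ge 1$ and takes intermediate values in between, so it acts as an infinitely differentiable version of the indicator of $[-1, 1]$.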



We divide the interval $[0,1]$ into $M$ equal subintervals of length $2h$, and on
each of them we construct a kernel-type function which vanishes at the
boundary of the subinterval together with all its derivatives. This ensures that
the Fourier partial sums with respect to the trigonometric basis in $\mathcal{L}_2[-1,1]$
give a natural parametric approximation to the function on each subinterval.

Let $(e_j)_{j\ge 1}$ be the trigonometric basis in $\mathcal{L}_2[-1,1]$, i.e.

$$ e_1 = 1/\sqrt{2} , \quad e_j(x) = Tr_j(\pi [j/2] x) , \quad j \ge 2 , \tag{6.8} $$

where the functions $(Tr_j)_{j\ge 2}$ are defined in (2.2).

Now, for any array $z = \{(z_{m,j})_{1\le m\le M_n,\, 1\le j\le N_n}\}$ we define the following
function

$$ S_{z,n}(x) = \sum_{m=1}^{M_n} \sum_{j=1}^{N_n} z_{m,j}\, D_{m,j}(x) , \tag{6.9} $$

where $D_{m,j}(x) = e_j(v_m(x))\, \chi_\eta(v_m(x))$,

$$ v_m(x) = \frac{x - \tilde x_m}{h_n} , \quad \tilde x_m = 2 m h_n \quad \text{and} \quad M_n = [1/(2h_n)] - 1 . $$

We assume that the sequences $(N_n)_{n\ge 1}$ and $(h_n)_{n\ge 1}$ satisfy the following
conditions.

$\mathbf{A}_1$) The sequence $N_n \to \infty$ as $n \to \infty$ and, for any $\nu > 0$,

$$ \lim_{n\to\infty} \frac{N_n^{\nu}}{n} = 0 . $$

Moreover, there exist $0 < \delta_1 < 1$ and $\delta_2 > 0$ such that

$$ h_n = O(n^{-\delta_1}) \quad \text{and} \quad h_n^{-1} = O(n^{\delta_2}) \quad \text{as} \ n \to \infty . $$

To define a prior distribution on the family of arrays, we choose the following
random array $\vartheta = \{(\vartheta_{m,j})_{1\le m\le M_n,\, 1\le j\le N_n}\}$ with

$$ \vartheta_{m,j} = t_{m,j}\, \zeta_{m,j} , \tag{6.10} $$

where $(\zeta_{m,j})$ are i.i.d. $\mathcal{N}(0,1)$ random variables and $(t_{m,j})$ are some
nonrandom positive coefficients. We make use of gaussian variables since they
possess the minimal Fisher information and therefore maximize the lower
bound (6.5). We set

J'=l 

We assume that the coefficients $(t_{m,j})_{1\le m\le M_n,\, 1\le j\le N_n}$ satisfy the following
conditions.

$\mathbf{A}_2$) There exists a sequence of positive numbers $(d_n)_{n\ge 1}$ such that



1 n n 

i-^EE 0' 2(fc-1) = • lim = o , (6.i2) 

n m=l j=l 



h 2k 

n 

moreover, for any $\nu > 0$,

$$ \lim_{n\to\infty} n^{\nu} \exp\{-d_n/2\} = 0 . $$



$\mathbf{A}_3$) For some $0 < \epsilon < 1$



" m=l ?=1 



A 4 ) There exists e > such that 



m=l J=l 
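Proposition 6.2. Let the conditions $\mathbf{A}_1$)-$\mathbf{A}_2$) hold. Then, for any $\nu > 0$ and
for any $\delta > 0$,

$$ \lim_{n\to\infty} n^{\nu} \max_{0\le l\le k-1} \mathbf{P}\big( \|S^{(l)}_{\vartheta,n}\|_\infty > \delta \big) = 0 . $$

Proposition 6.3. Let the conditions $\mathbf{A}_1$)-$\mathbf{A}_4$) hold. Then, for any $\nu > 0$,

$$ \lim_{n\to\infty} n^{\nu}\, \mathbf{P}\big( S_{\vartheta,n} \notin W_r^k \big) = 0 . $$

Proposition 6.4. Let the conditions $\mathbf{A}_1$)-$\mathbf{A}_4$) hold. Then, for any $\nu > 0$,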
lim n"E||S0 f + lgc) =0. 



Proposition 6.5. Let the conditions $\mathbf{A}_1$)-$\mathbf{A}_4$) hold. Then, for any function $g$
satisfying the conditions (3.7) and $\mathbf{H}_4$),

$$ \lim_{n\to\infty}\ \sup_{0\le x\le 1} \mathbf{E}\, \big| g^{-2}(x, S_{\vartheta,n}) - g_0^{-2}(x) \big| = 0 . $$

Proofs of Propositions 6.2-6.5 are given in Galtchouk, Pergamenshchikov,
2008 (Appendix A.2-A.6).



6.3 Bayes risk 

Now we will obtain the lower bound for the bayesian risk that yields the
lower bound (4.11) for the minimax risk.

We make use of the sequence of random functions $(S_{\vartheta,n})_{n\ge 1}$ defined in
(6.9) - (6.10) with the coefficients $(t_{m,j})$ satisfying the conditions $\mathbf{A}_1$)-$\mathbf{A}_4$),
which will be chosen later.

For any estimator $\hat S_n$ we introduce now the corresponding Bayes risk

$$ \mathcal{E}_n(\hat S_n) = \int_{\mathbb{R}^l} \mathbf{E}_{S_{z,n},\, q}\, \|\hat S_n - S_{z,n}\|^2\, \mu_\vartheta(dz) , \tag{6.13} $$

where the kernel family $(S_{z,n})$ is defined in (6.9), and $\mu_\vartheta$ denotes the distribution
of the random array $\vartheta$ defined by (6.10) in $\mathbb{R}^l$ with $l = M_n N_n$.

We recall that $q$ is the centered gaussian distribution in $\mathbb{R}^n$ with unit
covariance matrix.
First of all, we replace the functions $\hat S_n$ and $S$ by their Fourier series with
respect to the basis

$$ e_{m,i}(x) = (1/\sqrt{h})\, e_i(v_m(x))\, \mathbf{1}_{(|v_m(x)| \le 1)} . $$

By making use of this basis we can estimate the norm $\|\hat S_n - S_{z,n}\|^2$ from
below as

$$ \|\hat S_n - S_{z,n}\|^2 \ge \sum_{m=1}^{M_n} \sum_{j=1}^{N_n} \big(\hat\tau_{m,j} - \tau_{m,j}(z)\big)^2 , $$

where

$$ \hat\tau_{m,j} = \int_0^1 \hat S_n(x)\, e_{m,j}(x)\, dx \quad \text{and} \quad \tau_{m,j}(z) = \int_0^1 S_{z,n}(x)\, e_{m,j}(x)\, dx . $$

Moreover, from the definition (6.9) one gets

$$ \tau_{m,j}(z) = \sqrt{h} \sum_{i=1}^{N_n} z_{m,i} \int_{-1}^{1} e_i(u)\, e_j(u)\, \chi_\eta(u)\, du . $$

It is easy to see that the functions $\tau_{m,j}(\cdot)$ satisfy the condition (6.2) for
gaussian prior densities. In this case (see the definition in (6.5)) we have

$$ \bar\tau_{m,j} = (\partial/\partial z_{m,j})\, \tau_{m,j}(z) = \sqrt{h}\, \bar e_j(\chi_\eta) , $$

where

$$ \bar e_j(f) = \int_{-1}^{1} e_j^2(v)\, f(v)\, dv . \tag{6.14} $$


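Now to obtain a lower bound for the Bayes risk $\mathcal{E}_n(\hat S_n)$ we make use of
Theorem 6.1, which implies that

$$ \mathcal{E}_n(\hat S_n) \ge \sum_{m=1}^{M_n} \sum_{j=1}^{N_n} \frac{h\, \bar e_j^2(\chi_\eta)}{F_{m,j} + B_{m,j} + t_{m,j}^{-2}} , \tag{6.15} $$

where $F_{m,j} = \sum_{i=1}^{n} D_{m,j}^2(x_i)\, \mathbf{E}\, g^{-2}(x_i, S_{\vartheta,n})$ and

$$ B_{m,j} = \frac{1}{2} \sum_{i=1}^{n} \mathbf{E}\, \frac{\tilde{\mathbf{L}}_{m,j}^2(x_i, S_{\vartheta,n})}{g^4(x_i, S_{\vartheta,n})} \quad \text{with} \quad \tilde{\mathbf{L}}_{m,j}(x, S) = \mathbf{L}_{x,S}(D_{m,j}) . $$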
By making use of Proposition 6.5 we can show that



lim sup sup 

n-+oc l< m <M n l<j<N n 



and 



lim 



sup sup B 



n^oo nh l< m <M n l<j<N n ^ 







(6.16) 



(6.17) 



The detailed proof of these equalities is given in Galtchouk,
Pergamenshchikov, 2008 (Appendix A.8).

This means that, for any $\nu > 0$ and for sufficiently large $n$,

sup sup _ 2 2 < 1 + V , 

i< m <M n i<j<N n nhe^m W + t m ,j 
where $\tilde x_m$ is defined in (6.9). Therefore, if we denote in (6.15)



nhg Q (x m )t mj and fyfa y) = 

we obtain, for sufficiently large n 

2k n 2k + 1 sr^ 

m=l 

Moreover, the property (6.7) implies



9q {%m, 



3=1 



lim sup sup 

v-+o N>1 ( yi ,..., yN )eR» 



*jv(z/i,---,I/jv) 



where 



3=1 



Therefore we can write that, for sufficiently large n, 



2k I — V 

n\ n) - 1 + 



1+1 E 5 2 W *iV n (<!,■■■ 



(6.18) 



(6.19) 



m=l 
Obviously, to obtain a "good" lower bound for the risk $\mathcal{E}_n(\hat S_n)$ one needs
to maximize the right-hand side of the inequality (6.19). Hence we choose
the coefficients $(y_j)$ by maximizing the function $\Psi_N$, i.e.

$$\max_{y_1,\dots,y_N} \Psi_N(y_1,\dots,y_N) \quad\text{subject to}\quad \sum_{j=1}^{N} y_j\, j^{2k} \le R .$$

The parameter $R > 0$ will be chosen later to satisfy the condition $A_3$). By
the Lagrange multipliers method it is easy to find that the solution of this
problem is given by

$$y_j^*(R) = a^*(R)\, j^{-k} - 1 \qquad (6.20)$$

with

$$a^*(R) = \frac{1}{\sum_{i=1}^{N} i^{k}}\Big(R + \sum_{i=1}^{N} i^{2k}\Big) \quad\text{and}\quad 1 \le j \le N .$$

To obtain a positive solution in (6.20) we need to impose the following
condition

$$R > N^{k}\sum_{i=1}^{N} i^{k} - \sum_{i=1}^{N} i^{2k} . \qquad (6.21)$$
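The closed form (6.20) and the threshold (6.21) can be checked numerically. The following is a minimal sketch, assuming that the objective has the form $\Psi_N(y) = \sum_{j=1}^{N} y_j/(1+y_j)$, which is the form consistent with the Lagrange solution (6.20); the function names and the values of N, k and R are illustrative and are not taken from the paper.

```python
import math


def power_sums(N, k):
    """Return (sum_i i^k, sum_i i^(2k)) for i = 1..N."""
    s_k = sum(i ** k for i in range(1, N + 1))
    s_2k = sum(i ** (2 * k) for i in range(1, N + 1))
    return s_k, s_2k


def lagrange_solution(N, k, R):
    """Closed form (6.20): y_j = a*(R) j^(-k) - 1 for j = 1..N."""
    s_k, s_2k = power_sums(N, k)
    a_star = (R + s_2k) / s_k
    return [a_star / j ** k - 1.0 for j in range(1, N + 1)]


def psi(y):
    """Assumed objective Psi_N(y) = sum_j y_j / (1 + y_j)."""
    return sum(v / (1.0 + v) for v in y)


def constraint(y, k):
    """Left-hand side of the constraint: sum_j y_j j^(2k)."""
    return sum(v * j ** (2 * k) for j, v in enumerate(y, start=1))


def positivity_threshold(N, k):
    """Right-hand side of (6.21): N^k sum_i i^k - sum_i i^(2k)."""
    s_k, s_2k = power_sums(N, k)
    return N ** k * s_k - s_2k


if __name__ == "__main__":
    N, k = 20, 2                              # illustrative values
    R = 2.0 * positivity_threshold(N, k)      # any R above the threshold
    y_star = lagrange_solution(N, k, R)
    print("constraint active:", math.isclose(constraint(y_star, k), R))
    print("all coordinates positive:", min(y_star) > 0)
    print("Psi_N at the maximizer:", psi(y_star))
```

The smallest coordinate is the last one, $y_N^*(R)$, so requiring it to be positive gives exactly the condition (6.21).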

Moreover, from the condition $A_3$) we obtain that

$$R \le \frac{2^{2k+1}(1-\varepsilon)\, r\, n\, h^{2k+1}}{\pi^{2k}\,\hat g_0} := R^*_n , \qquad (6.22)$$

where 

m=l 

Note that, by the condition $H_4$), the function $g_0(\cdot) = g(\cdot, S_0)$ is continuous
on the interval [0, 1], therefore,

$$\lim_{n\to\infty} \hat g_0 = \int_0^1 g^2(x, S_0)\,dx = \varsigma(S_0) \quad\text{with}\quad S_0 \equiv 0 . \qquad (6.23)$$

Now we have to choose the sequence $(h_n)$. Note that if we put in (6.10)

$$t_{m,j} = g_0(x_m)\sqrt{y_j^*(R)}\,\big/\sqrt{n h_n} , \quad\text{i.e.}\quad y_{m,j} = y_j^*(R) , \qquad (6.24)$$
we can rewrite the inequality (6.19) as



^ 2fc+1 ^n(^n) > 2/T " 2 * +1 > ( 6 - 25 ) 



where

$$\Psi^*_N(R) = N - \frac{\big(\sum_{j=1}^{N} j^{k}\big)^2}{R + \sum_{j=1}^{N} j^{2k}} .$$



It is clear that

$$\frac{k^2}{(k+1)^2} \le \liminf_{N\to\infty}\,\inf_{R>0}\, \Psi^*_N(R)/N \le \limsup_{N\to\infty}\,\sup_{R>0}\, \Psi^*_N(R)/N \le 1 .$$
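The two-sided bound above can be illustrated numerically. The sketch below assumes the value $\Psi^*_N(R) = N - (\sum_j j^k)^2/(R + \sum_j j^{2k})$, obtained by substituting the solution (6.20) into an objective of the assumed form $\sum_j y_j/(1+y_j)$; the helper names and the chosen values of N and R are hypothetical.

```python
def psi_star_ratio(N, k, R):
    """Psi*_N(R) / N with Psi*_N(R) = N - (sum j^k)^2 / (R + sum j^(2k))."""
    s_k = sum(j ** k for j in range(1, N + 1))
    s_2k = sum(j ** (2 * k) for j in range(1, N + 1))
    return 1.0 - s_k ** 2 / (N * (R + s_2k))


def lower_limit(k):
    """Asymptotic lower bound k^2 / (k + 1)^2."""
    return k ** 2 / (k + 1) ** 2


if __name__ == "__main__":
    N = 2000  # illustrative; the bound is asymptotic in N
    for k in (1, 2, 3):
        near_zero_R = psi_star_ratio(N, k, 1e-9)   # R close to 0
        huge_R = psi_star_ratio(N, k, 1e30)        # R very large
        print(k, lower_limit(k), near_zero_R, huge_R)
```

For R close to zero the ratio approaches $k^2/(k+1)^2$ as N grows, and for large R it approaches 1, which is why the bandwidth $h_n$ must be tuned to keep the bound positive and finite.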

Therefore, to obtain a positive finite asymptotic lower bound in (6.25) we
have to take the parameter $h_n$ as

$$h_n = h_*\, n^{-1/(2k+1)}\, N_n \qquad (6.26)$$

with some positive coefficient $h_*$. Moreover, the conditions (6.21) - (6.22)
imply that, for sufficiently large $n$,



22fc+l ^ 



'1 _ e v — h 2k+1 > V ? fc — V "' 2/ 



Now taking into account that, for sufficiently large n, 



N 



we obtain the following condition on $h_*$:

$$h_* \ge (\upsilon^*_\varepsilon)^{1/(2k+1)} , \qquad (6.27)$$

where 

and c *_ 2 2k+ \l-e)r 
e c*{k + \){2k + 1) £ n 2 H{S Q ) ' 

To maximize the function $\Psi^*_N(R)$ on the right-hand side of the inequality
(6.25), we take $R = R^*_n$ defined in (6.22). Therefore, we obtain that

$$\liminf_{n\to\infty}\,\inf_{\hat S_n}\, n^{2k/(2k+1)}\, \mathcal{E}_n(\hat S_n) \ge \varsigma(S_0)\, F(h_*)/2 , \qquad (6.28)$$
where

$$F(x) = \frac{1}{x} - \frac{2k+1}{(k+1)^2\big(c^*(2k+1)\,x^{2k+2} + x\big)} .$$

Furthermore, taking into account that

$$F'(x) = -\frac{\big(c^*(2k+1)(k+1)\,x^{2k+1} - k\big)^2}{(k+1)^2\big(c^*(2k+1)\,x^{2k+2} + x\big)^2} \le 0 ,$$



we get 



max F(hJ = F((v*) 1/(2k+1) ) = ^ + £ ' )k (^-V^+D 



^>(^) 1/(2fc+1) * £ & + 1 

where 

e' = £ - . (6.29) 

This means that to obtain in (6.28) the maximal lower bound, one has to
take in (6.26)

$$h_* = (\upsilon^*_\varepsilon)^{1/(2k+1)} . \qquad (6.30)$$
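The monotonicity argument behind the choice (6.30) can be checked numerically. The sketch below compares the closed-form derivative of $F$ with a central finite difference and verifies that $F$ is non-increasing on a grid; the parameter `c` is a stand-in for the constant $c^*$, and its value, like the value of `k` and the grid, is an arbitrary assumption made for illustration.

```python
def F(x, k, c):
    """F(x) = 1/x - (2k+1) / ((k+1)^2 (c (2k+1) x^(2k+2) + x))."""
    return 1.0 / x - (2 * k + 1) / (
        (k + 1) ** 2 * (c * (2 * k + 1) * x ** (2 * k + 2) + x)
    )


def F_prime(x, k, c):
    """Closed-form derivative of F; it is a negated square, hence <= 0."""
    num = (c * (2 * k + 1) * (k + 1) * x ** (2 * k + 1) - k) ** 2
    den = (k + 1) ** 2 * (c * (2 * k + 1) * x ** (2 * k + 2) + x) ** 2
    return -num / den


def central_difference(f, x, step=1e-6):
    """Second-order finite-difference approximation of f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


if __name__ == "__main__":
    k, c = 2, 0.8  # illustrative values only
    for x in (0.3, 0.7, 1.5, 3.0):
        numeric = central_difference(lambda t: F(t, k, c), x)
        print(x, F_prime(x, k, c), numeric)
```

Since $F' \le 0$, the function $F$ is largest at the smallest admissible argument, which is why the lower bound is maximized at the boundary of the constraint (6.27).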

It is important to note that if one defines the prior distribution in the
Bayesian risk (6.13) by formulas (6.10), (6.24), (6.26) and (6.30), then the
Bayesian risk should depend on a parameter $0 < \varepsilon < 1$, i.e. $\mathcal{E}_n = \mathcal{E}_{\varepsilon,n}$.
Therefore, the inequality (6.28) implies that, for any $0 < \varepsilon < 1$,

liminf inf n 2fc /( 2fc+1 <„(SJ > 1 + n | l J k (S ) , (6.31) 

™->oo S„ \ i - / 

where the function $\gamma_k(S_0)$ is defined in (4.7) for $S_0 \equiv 0$.

Now to end the definition of the sequence of the random functions $(S_{\vartheta,n})$
defined by (6.9) and (6.10), one has to define the sequence $(N_n)$. Let us
remember that we make use of the sequence $(S_{\vartheta,n})$ with the coefficients $(t_{m,j})$
constructed in (6.24) for $R = R^*_n$ given in (6.22) and for the sequence $h_n$ given
by (6.26) and (6.30), for some fixed $0 < \varepsilon < 1$.

We will choose the sequence $(N_n)$ to satisfy the conditions $A_1$)-$A_4$). One
can take, for example, $N_n = [\ln^4 n] + 1$. Then the condition $A_1$) is trivial.
Moreover, taking into account that in this case

^ _ 2 2fc+1 (l - e)r ^^ T2k+1 _ q(S ) k N 2k+i 

n ' -K 2k g 6 11 g (Jfe + l)(2ife + l) n 
we find, thanks to the convergence (6.23),

p* , sr^N n -2k 

lim n V =1. 

n 'j=l J 

Therefore, the solution (6.20), for sufficiently large $n$, satisfies the following
inequality:

$$\max_{1 \le j \le N_n} y_j^*(R^*_n)\, j^{k} \le 2 N_n^{k} .$$

Now it is easy to see that the condition $A_2$) holds with $d_n = \sqrt{N_n}$ and the
condition $A_4$) holds for arbitrary $0 < \varepsilon < 1$. As to the condition $A_3$), note
that, in view of the definition of $t_{m,j}$ in (6.24), we get

M N N 

ftik-i j = 2nh 2k+1 So ^ ^ ^ 

n m=l j=l n j=l 

'I - e r - 



N 2k+1 2v* ' 7 V 7T 

n e 



Hence the condition A 3 ). 



7 Estimation of non periodic function 

Now we consider the estimation problem of a non-periodic regression function
$S$ in the model (1.1). In this case we will estimate the function $S$ on any
interior interval $[a, b]$ of $[0, 1]$, i.e. for $0 < a < b < 1$.

It should be pointed out that at the boundary points $x = 0$ and $x = 1$,
one must make use of kernel estimators (see Brua, 2007).

Let now $\chi$ be an infinitely differentiable $[0, 1] \to \mathbb{R}_+$ function such that
$\chi(x) = 1$ for $a \le x \le b$ and $\chi^{(k)}(0) = \chi^{(k)}(1) = 0$ for all $k \ge 0$, for example,

xix) = ~ J v (^) howW**, 

where $V$ is some kernel function introduced in (6.6),

/ a a b 1 , 1 . . 1 

a — — , o = — I — and ri = - mm a , l — b . 
Multiplying the equation (1.1) by the function $\chi(\cdot)$ and simulating some
i.i.d. $\mathcal{N}(0, 1)$ sequence $(\zeta_j)_{1 \le j \le n}$, one comes to the estimation problem of the
periodic regression function $\tilde S(x) = S(x)\chi(x)$ in the model

y j = S(Xj) +aJS)L , 



2 



where <Jj(S) = yj cr 2 .{S) + e 

~ <tAS) e 

and $\varepsilon > 0$ is some sufficiently small parameter.

It is easy to see that if the sequence $(\sigma_j(S))_{1 \le j \le n}$ satisfies the conditions
$H_1$) - $H_4$), then the sequence $(\tilde\sigma_j(S))_{1 \le j \le n}$ satisfies these conditions as well
with

$$\tilde g(x_j, S) = \sqrt{g^2(x_j, S)\,\chi^2(x_j) + \varepsilon^2} .$$



8 Conclusion 

In conclusion, it should be noted that this paper completes the investigation
of the estimation problem for the nonparametric regression function in the
heteroscedastic regression model (1.1) in the case of quadratic risk. It is
shown that the adaptive procedure (2.13) satisfies the non-asymptotic oracle
inequality and is asymptotically efficient for estimating a periodic
regression function as well. From a practical point of view, the procedure (2.13)
gives an acceptable accuracy even for small samples, as shown via simulations
by Galtchouk, Pergamenshchikov, 2009.



9 Proofs 

9.1 Properties of the trigonometric basis 
Lemma 9.1. For any function $S \in W_r^k$,



r 
it 



sup sup m 2k I Yl e l) < J^- O-l) 

n>\ \<rn<n—\ \ . . / 



\j=m-\-l 
Lemma 9.2. For any m > 0,



sup sup N m 

n>2 xe[o,i] 



N 



< 2 m . 



(9.2) 



Proofs of Lemma 9.1 and Lemma 9.2 are given in Galtchouk,
Pergamenshchikov, 2007.

Lemma 9.3. Let $\theta_{j,n}$ and $\theta_j$ be the Fourier coefficients defined in (2.4) and
(3.4), respectively. Then, for $1 \le j \le n$ and $n \ge 2$,

$$\sup_{S \in W_r^k} |\theta_{j,n} - \theta_j| \le 2\pi\sqrt{r}\, j/n . \qquad (9.3)$$

Proof. Indeed, we have



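$$|\theta_{j,n} - \theta_j| = \Big|\sum_{l=1}^{n}\int_{x_{l-1}}^{x_l}\big(S(x_l)\phi_j(x_l) - S(x)\phi_j(x)\big)\,dx\Big|$$

$$\le n^{-1}\sum_{l=1}^{n}\int_{x_{l-1}}^{x_l}\big(|\dot S(z)\phi_j(z)| + |S(z)\dot\phi_j(z)|\big)\,dz$$

$$= n^{-1}\int_0^1\big(|\dot S(z)|\,|\phi_j(z)| + |S(z)|\,|\dot\phi_j(z)|\big)\,dz .$$

By making use of the Bounyakovskii-Cauchy-Schwarz inequality we get

$$|\theta_{j,n} - \theta_j| \le n^{-1}\big(\|\dot S\|\,\|\phi_j\| + \|S\|\,\|\dot\phi_j\|\big) \le n^{-1}\big(\|\dot S\| + \pi j\,\|S\|\big) .$$

The definition of the class $W_r^k$ implies (9.3). Hence Lemma 9.3. □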

9.2 Proof of Theorem 5.1

To prove the theorem we will adapt to the heteroscedastic case the
corresponding proof from Nussbaum, 1985.

First, from (2.5) we obtain that, for any $p \in \mathcal{P}_n$,



n i n 

e*, p \\s-s\\i=j2(i - \fo 2 hn +-E ~ X hn{S) 

i=i j'=i 



(9.4) 
where

1 n 

S,n(^) = -E^)^)- 



n 
1=1 



Setting now uj = u(a) with the function u defined in (12. 9p . the index a 
defined in (jOJ), jo = [we n ], ji = [w/ej and 

we rewrite (9.4) as follows

*sjS-S\\l= £ (1- (9.5) 



i=io+i 

n 



i=i 



with 



n 1 n 



n 

j'=j'i J 

Note that we have decomposed the first term on the right-hand side of (9.4) into
the sum

E(l-%+A li n. 

J =.70+1 

This decomposition allows us to show that $\Delta_{1,n}$ is negligible and further to
approximate the first term by a similar term in which the coefficients $\theta_{j,n}$ will
be replaced by the Fourier coefficients $\theta_j$ of the function $S$.

Taking into account the definition of $\omega$ in (2.9), we can bound $\omega$ as

Therefore, by Lemma 9.1 we obtain

$$\lim_{n\to\infty}\,\sup_{S \in W_r^k}\, n^{2k/(2k+1)}\,\Delta_{1,n} = 0 .$$
Let us consider now the next term $\Delta_{2,n}$. We have

n 

Now by Lemma 9.2 and the definition (2.8), we obtain directly the same
property for $\Delta_{2,n}$, i.e.

$$\lim_{n\to\infty}\,\sup_{S \in W_r^k}\, n^{2k/(2k+1)}\,|\Delta_{2,n}| = 0 .$$

Setting 

jl—l n 

i=io J=1 
and applying the well-known inequality

$$(a + b)^2 \le (1 + \delta)\,a^2 + (1 + 1/\delta)\,b^2$$

to the first term on the right-hand side of the inequality (9.5), we obtain
that, for any $\delta > 0$ and for any $p \in \mathcal{P}_n$,

V s JS-S\\ 2 n <(l + 6)% n (S)n- 2k ^ 

+ A lin , + A 2in + (l + l/5)A3 iri , (9.6) 

where 

ii-i 
i=io+i 

Taking into account that k > 1 and that 

Ji < ^ + (A)^ n^(e n )- (2fc+2)/(2fc+1) , 
we can show through Lemma 9.3 that

$$\lim_{n\to\infty}\,\sup_{S \in W_r^k}\, n^{2k/(2k+1)}\,\Delta_{3,n} = 0 .$$
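The elementary inequality $(a+b)^2 \le (1+\delta)a^2 + (1+1/\delta)b^2$ used to obtain (9.6) follows from $2ab \le \delta a^2 + b^2/\delta$, with equality when $b = \delta a$. The following minimal sketch checks it on random inputs; the function name and the sampling ranges are illustrative assumptions.

```python
import math
import random


def weighted_square_bound(a, b, delta):
    """Right-hand side (1 + delta) a^2 + (1 + 1/delta) b^2, for delta > 0."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    return (1.0 + delta) * a * a + (1.0 + 1.0 / delta) * b * b


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        a, b = rng.uniform(-10, 10), rng.uniform(-10, 10)
        delta = rng.uniform(0.01, 10)
        # The left-hand side never exceeds the bound.
        print((a + b) ** 2 <= weighted_square_bound(a, b, delta))
    # Equality case b = delta * a.
    print(math.isclose((2.0 + 1.0) ** 2, weighted_square_bound(2.0, 1.0, 0.5)))
```

A small $\delta$ keeps the leading term almost unchanged at the price of a large factor $1 + 1/\delta$ on the remainder, which is harmless here because the remainder is asymptotically negligible.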
Therefore, the inequality (9.6) yields

2fc — 

limsupn 2fe + 1 sup lZ n (S, S)/~fk{S) < limsup sup ■j k n (S)/jk{S) 

and to prove (5.4) it suffices to show that

limsup sup % tn (S)/j k (S) < 1. (9.7) 

First, it should be noted that the definition (5.1) and the inequalities (3.7) -
(3.8) imply directly

lim sup \t n /f(S) — l| = . 

n^oo SeWf 

Moreover, by the definition of (A J ) 1<J<n in (15. 3p . for sufficiently large n for 
which t n >f(S), we find 



sup n^x = tt- 2 *^)- 3 */^ 1 ) < vr- 2fe (A fc r(S))- 



i>i Oi) S 

Therefore, by the definition of the coefficients $(a_j)_{j \ge 1}$ in (3.5), one has
limsupn^ sup SU p 7r 2k (A k r(S)) 2k/{2k+1 \l - Xj) 2 /^ < I. 

n-s>oo SeWf j>j 

Furthermore, in view of the definition (2.8) we calculate directly



lim sup 



n ~ r 1 
n -2W+i J2X* - (A k f(S))^ / (1 - z k ) 2 dz 



0. 



Now, the definition of $W_r^k$ in (3.3) and the condition (3.6) imply the inequality
(9.7). Hence Theorem 5.1. □

9.3 Proof of Theorem 6.1

For any z = (z x , . . . , Zj)' G M n , we set 

1 d 
Note that, due to the condition (3.7), the density (6.3) is bounded, i.e.

f(v } z)<(2irgX n/2 - 
So through (6.2) we obtain that

lim t(z) f{v,z) Vi { Zi ) = 0. 
Therefore, integrating by parts yields 

E(f B -r(0)) ft = / (? n (v)-r(z))^(f(v,zMz))dzdv 

J R n + l OZ i 

<} t(z)) $(z) ( / f(v, z)dv) dz = T i . 



Now the Bounyakovskii-Cauchy-Schwarz inequality gives the following lower 
bound 

E(f n - r(tf)) 2 > 7?/EeJ . 
To estimate the denominator in the last ratio, note that 

ft ( V| *)=£( V| *) + £*^ with = (5/5^) In /(«,*). 

From (6.1) it follows that

Moreover, the conditions $H_2$) and (6.4) imply

(d/dzjofrz) = (d/d Zi )g 2 (x p S z ) = Ux^z) 
from which it follows 

E (f t (Y^)) 2 =F l + B l . 
This implies inequality (6.5). Hence Theorem 6.1. □

9.4 Proof of Theorem 4.3

In this section we prove Theorem 4.3. Lemma 5.2 implies that to prove the
lower bounds (4.10) and (4.11), it suffices to show

liminf inf K (S n ) > 1 , (9.8) 

where 

K (S n ) = sup E s J\S n -S\\ 2 / lk (S). 

sew? 

For any estimator $\hat S_n$, we denote by $\hat S_n^0$ its projection onto $W_r^k$, i.e.
$\hat S_n^0 = \mathrm{Pr}_{W_r^k}(\hat S_n)$. Since $W_r^k$ is a convex set, we get

\\s. v-sf > \\s° -s\\ 2 . 

ii n n — n n n 

Now we introduce the following set 

E n = { max max < d n } , (9.9) 

l<m<M n l<j<N J 

where $(\zeta_{m,j})$ are i.i.d. $\mathcal{N}(0, 1)$ random variables from (6.10) and the sequence
$(d_n)_{n \ge 1}$ is given in the condition $A_2$). It is clear that the condition $A_1$) implies

limsup n"P (Z c n ) = 0. (9.10) 
Therefore, we can write that 



Ko(S n ) > / — " A**(dz). 

J{z:S z , n ew*}ns n 7fc(A,J 

Here the kernel function family $(S_{z,n})$ is given in (6.9), in which $N_n = [\ln^4 n] + 1$
and the parameter $h$ is defined in (6.26) and (6.30); the measure $\mu_\vartheta$ is defined
in (6.13). Moreover, note that on the set $\Xi_n$ the random function $S_{\vartheta,n}$ is
uniformly bounded, i.e.

ll^,Jloo= SU P \ S *,n( x )\ < V^C (9-11) 

0<IE<1 

where the coefficient $t^*$ is defined in (6.11).

Thus, we estimate the risk $\mathcal{R}_0(\hat S_n)$ from below as

Ms n )>^ I v Sz j\s° n -s z j 2 ^(dz) 

X ^{*:S,, n 6W*}nS B 

with 

7 ;= sup lk (S). (9.12) 

By making use of the Bayes risk (6.13) with the prior distribution given by
formulas (6.10), (6.24), (6.26) and (6.30), for any fixed parameter $0 < \varepsilon < 1$,
we rewrite the lower bound for $\mathcal{R}_0(\hat S_n)$ as

TZ (S n )>£ ein (S° n )/ 1 * n -2nj 1 * n (9.13) 

with 

fi n = E ( 1 {w^ } + h%)(r + \\s*J 2 ) ■ 

In Section 6.3 we proved that the parameters of the chosen prior distribution
satisfy the conditions $A_1$)-$A_4$). Therefore, Propositions 6.3-6.4 and the limit
(9.10) imply that, for any $\nu > 0$,

$$\lim_{n\to\infty} n^{\nu}\,\Omega_n = 0 .$$

Moreover, by the condition $H_4$) the sequence $\gamma^*_n$ goes to $\gamma_k(S_0)$ as $n \to \infty$.
Therefore, from this, (6.31) and (9.13) we get, for any $0 < \varepsilon < 1$,

$$\liminf_{n\to\infty}\,\inf_{\hat S_n}\, n^{2k/(2k+1)}\,\mathcal{R}_0(\hat S_n) \ge \frac{(1+\varepsilon')(1-\varepsilon)^{1/(2k+1)}}{(1+\varepsilon)^{1/(2k+1)}} ,$$

where $\varepsilon'$ is defined in (6.29). Letting here $\varepsilon \to 0$ implies the inequality
(9.8). Hence Theorem 4.3. □